refactor(render): scheme owns bundle rendering [LTV-Pe]#108
Merged
Conversation
Completes the second half of the generation-scheme seam against the known-good lead-scoring path, so the abstraction is fully formed before any file moves or lifecycle work. Pure refactor — lead-scoring bundle output is byte-identical. - GenerationScheme protocol gains write_bundle(bundle, path, generation_timestamp). - LeadScoringScheme.write_bundle now owns the lead-scoring on-disk shape (9-table relational export incl. snapshot-safe projection, lead snapshot + task splits, dataset card, feature dictionary, exposure metadata, manifest), reusing the already-modular shared envelope helpers (build_manifest, apply_exposure, get_filter). - api/bundle.py::write_bundle is now a thin dispatcher on bundle.spec.scheme; WorldBundle.save() delegates to the producing scheme. The historical write_bundle(bundle, path) entry point is preserved for existing callers. - Registration footgun fixed: get_scheme/available_schemes lazily trigger builtin registration (_ensure_builtins), so resolution no longer depends on importing the leadforge.schemes package first — `from leadforge.schemes.base import get_scheme` now works on its own. Roadmap reordered (docs/ltv/roadmap.md): the render seam (LTV-Pe) is sequenced before the physical module move (now LTV-Pf), so the move relocates a complete, both-halves scheme and bundle.py's call sites change only once. Downstream PR codes shifted to Pa..Ps (19 PRs). Verified byte-identical to main (14/14 files of a pinned-timestamp bundle). Tests: tests/schemes/test_render_dispatch.py (5) — save dispatch, unknown-scheme on write, unpopulated RuntimeError, render-path determinism, and base-direct resolution (footgun guard, subprocess). Full suite 1503 passed / 51 skipped; ruff + mypy clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Check off LTV-Pe, link PR #108, advance status. Next: LTV-Pf (physical move). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Refactors bundle rendering so that each generation scheme owns its on-disk bundle shape, with api/bundle.py becoming a thin dispatcher based on bundle.spec.scheme. This completes the “render half” of the GenerationScheme seam while aiming to keep lead-scoring bundle output byte-identical, and it hardens scheme resolution against import-order issues via lazy built-in registration.
Changes:
- Add
write_bundle(bundle, path, generation_timestamp)to theGenerationSchemeprotocol and implement it onLeadScoringScheme. - Make
leadforge.api.bundle.write_bundledelegate to the resolved scheme instead of orchestrating rendering directly. - Ensure built-in scheme registration is triggered lazily on
get_scheme()/available_schemes(); add tests for dispatch, determinism, and import-order independence.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
leadforge/schemes/base.py |
Extends the scheme protocol with write_bundle and adds lazy built-in registration on scheme resolution. |
leadforge/schemes/lead_scoring/__init__.py |
Moves the lead-scoring bundle-writing orchestration into LeadScoringScheme.write_bundle. |
leadforge/api/bundle.py |
Simplifies bundle writing into a scheme-based dispatch function. |
tests/schemes/test_render_dispatch.py |
Adds coverage for save/write dispatch, unknown-scheme handling, determinism, and the import-order footgun fix. |
docs/ltv/roadmap.md |
Updates LTV roadmap sequencing and PR codes to reflect the render-seam-first plan. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+112
to
+118
| global _builtins_loaded | ||
| if _builtins_loaded: | ||
| return | ||
| _builtins_loaded = True | ||
| import importlib | ||
|
|
||
| importlib.import_module("leadforge.schemes") |
Comment on lines
+238
to
+245
| (root / "dataset_card.md").write_text( | ||
| render_dataset_card( | ||
| bundle.spec, | ||
| task_manifest=task, | ||
| table_counts=table_row_counts, | ||
| features=visible_features, | ||
| ) | ||
| ) |
…self-review) [LTV-Pe]
Hostile self-review of the first revision: it relocated the ~110-line
write_bundle monolith into LeadScoringScheme.write_bundle and the docstrings
over-claimed that a shared "envelope" was being reused, when in fact the write
orchestration was copied — lifecycle would duplicate ~60% of it.
Two honest corrections (no behaviour change; byte-identical preserved):
- Extract the one genuinely scheme-agnostic, safe-to-share step:
`write_relational_tables(dfs, tables_dir, redacted) -> {name: rows}` in
leadforge/render/relational.py (redaction-drop + parquet-write + row-count
loop). LeadScoringScheme uses it; LifecycleScheme will too instead of copying.
- Stop over-claiming. Docstrings (api/bundle.py, base protocol,
LeadScoringScheme.write_bundle) now state plainly that each scheme currently
orchestrates its OWN full write; the deeper envelope/hook decomposition is
deferred to LTV-M6 *because* build_manifest (takes world_graph) and
apply_exposure (writes the lead-scoring hidden graph + latent registry) are
still lead-scoring-coupled and the decomposition is best designed with a
second scheme in hand.
Tests: tests/render/test_write_relational_tables.py (5) — per-table parquet +
row counts, nested-dir creation, redaction drop, no-redaction passthrough,
iteration-order preservation. Full suite 1508 passed / 51 skipped; ruff + mypy
clean; verified byte-identical to main (14/14 files).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
pr-agent-context report: This run includes unresolved review comments on PR #108 in repository https://github.com/leadforge-dev/leadforge
For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.
After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.
# Copilot Comments
## COPILOT-1
Location: leadforge/schemes/base.py:120
URL: https://github.com/leadforge-dev/leadforge/pull/108#discussion_r3387279835
Root author: copilot-pull-request-reviewer
Comment:
`_ensure_builtins` sets `_builtins_loaded = True` before attempting the import. If `import_module('leadforge.schemes')` raises (or if multiple threads call this concurrently), the flag can be left `True` while built-ins are not actually registered, causing `get_scheme()` / `available_schemes()` to fail with `UnknownSchemeError` even though the package would be importable on retry. Set the flag only after a successful import so failures don’t poison the registry state.
## COPILOT-2
Location: leadforge/schemes/lead_scoring/__init__.py:249
URL: https://github.com/leadforge-dev/leadforge/pull/108#discussion_r3387279907
Root author: copilot-pull-request-reviewer
Comment:
`Path.write_text(...)` defaults to the platform encoding. Since the bundle render path is meant to be byte-deterministic (and this file is part of that output), it’s safer to pin `encoding='utf-8'` (and optionally `newline='\n'`) to avoid locale-dependent bytes on systems where the default isn’t UTF-8.Run metadata: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Completes the second half of the generation-scheme seam (
LTV-Pe,milestone
LTV-M2) against the known-good lead-scoring path, so theabstraction is fully formed before any file moves or lifecycle work. Pure
refactor — lead-scoring bundle output is byte-identical.
This reorders the roadmap on purpose: the render path is where schemes diverge
most (9-table vs 6-table, lead vs customer snapshot, classification vs
regression splits), so it's designed here against lead-scoring with
byte-identity as the oracle — not later, co-developed with an unfamiliar
lifecycle pipeline.
Changes
write_bundle(bundle, path, generation_timestamp).LeadScoringScheme.write_bundlenow owns the lead-scoring on-disk shape(relational export incl. the snapshot-safe projection, lead snapshot + task
splits, dataset card, feature dictionary, exposure metadata, manifest),
reusing the already-modular shared helpers (
build_manifest,apply_exposure,get_filter).api/bundle.py::write_bundleis now a thin dispatcher onbundle.spec.scheme;WorldBundle.save()delegates to the producing scheme.The historical
write_bundle(bundle, path)entry point is preserved.get_scheme/available_schemeslazily trigger builtin registration, soresolution no longer depends on importing the
leadforge.schemespackagefirst —
from leadforge.schemes.base import get_schemenow works on its own.Roadmap reorder
LTV-Pe= render seam (this PR);LTV-Pf= physical module move (wasPe);LTV-Pg= scaffold lifecycle (wasPf). Downstream codes shifted toPa..Ps(19 PRs). Same destination — the move now relocates a completescheme and
bundle.py's call sites change only once.Verification
tests/schemes/test_render_dispatch.py(5) — save dispatch,unknown-scheme on write, unpopulated
RuntimeError, render-path determinism,base-direct resolution (footgun guard, subprocess).
ruff+mypyclean.BUNDLE_SCHEMA_VERSIONunchanged.🤖 Generated with Claude Code